121 research outputs found

    Text-to-SQL Empowered by Large Language Models: A Benchmark Evaluation

    Large language models (LLMs) have emerged as a new paradigm for the Text-to-SQL task. However, the absence of a systematic benchmark inhibits the design of effective, efficient and economical LLM-based Text-to-SQL solutions. To address this challenge, in this paper, we first conduct a systematic and extensive comparison of existing prompt engineering methods, including question representation, example selection and example organization, and with these experimental results, we elaborate on their pros and cons. Based on these findings, we propose a new integrated solution, named DAIL-SQL, which refreshes the Spider leaderboard with 86.6% execution accuracy and sets a new bar. To explore the potential of open-source LLMs, we investigate them in various scenarios, and further enhance their performance with supervised fine-tuning. Our explorations highlight open-source LLMs' potential in Text-to-SQL, as well as the advantages and disadvantages of supervised fine-tuning. Additionally, towards an efficient and economical LLM-based Text-to-SQL solution, we emphasize token efficiency in prompt engineering and compare the prior studies under this metric. We hope that our work provides a deeper understanding of Text-to-SQL with LLMs, and inspires further investigations and broad applications.
    Comment: We have released code on https://github.com/BeachWang/DAIL-SQ

    A simple microscopy setup for visualizing cellular responses to DNA damage at particle accelerator facilities

    Cellular responses to DNA double-strand breaks (DSBs) not only promote genomic integrity in healthy tissues, but also largely determine the efficacy of many DNA-damaging cancer treatments, including X-ray and particle therapies. A growing body of evidence suggests that activation of the mechanisms that detect, signal and repair DSBs may depend on the complexity of the initiating DNA lesions. Studies focusing on this, as well as on many other radiobiological questions, require reliable methods to induce DSBs of varying complexity, and to visualize the ensuing cellular responses. Accelerated particles of different energies and masses are exceptionally well suited for this task, due to the nature of their physical interactions with the intracellular environment, but visualizing cellular responses to particle-induced damage - especially in their early stages - at particle accelerator facilities, remains challenging. Here we describe a straightforward approach for real-time imaging of early response to particle-induced DNA damage. We rely on a transportable setup with an inverted fluorescence confocal microscope, tilted at a small angle relative to the particle beam, such that cells can be irradiated and imaged without any microscope or beamline modifications. Using this setup, we image and analyze the accumulation of fluorescently-tagged MDC1, RNF168 and 53BP1—key factors involved in DSB signalling—at DNA lesions induced by 254 MeV α-particles. Our results provide a demonstration of technical feasibility and reveal asynchronous initiation of accumulation of these proteins at different individual DSBs

    Computer-Aided Diagnosis Evaluation of the Correlation Between Magnetic Resonance Imaging With Molecular Subtypes in Breast Cancer

    Background: There is a demand for additional methods that can differentiate breast tumors into molecular subtypes precisely and conveniently. Purpose: The present study aimed to determine suitable classifiers and to investigate the general applicability of computer-aided diagnosis (CAD) for associating breast cancer molecular subtypes with extracted MR imaging features. Methods: We analyzed a total of 264 patients (mean age: 47.9 ± 9.7 years; range: 19–81 years) with 264 masses (mean size: 28.6 ± 15.86 mm; range: 5–91 mm) using a Unet model for segmentation and Gradient Tree Boosting (GTB) for classification. Results: The tumors were segmented clearly and automatically by the Unet model. All extracted features, including the shape and texture features of the tumors and the clinical features, were input into the classifiers, and the GTB classifier outperformed the other classifiers, achieving an F1-score of 0.72, an AUC of 0.81 and a score of 0.71. Analyzing different feature combinations, we found that texture features combined with clinical features are the optimal features for differentiating the breast cancer subtypes. Conclusion: CAD is feasible for differentiating breast cancer subtypes; automatic segmentation is feasible with the Unet model, and texture features extracted from breast MR imaging, together with clinical features, can help differentiate the molecular subtype. Moreover, among the clinical features, BPE and age have the best potential for subtype

    Shallow Crustal Structure of S-Wave Velocities in the Coastal Area of South China Constrained by Receiver Function Amplitudes

    As a traditional method, passive seismic exploration is used to construct the body-wave velocity structure of the upper crust, but it is cost-ineffective and depth-limited when applied to large areas. In this study, we use another more economical method to determine the S-wave velocity (SWV) of the upper crust based on the principle that the amplitude of the direct P-wave on the teleseismic receiver function is sensitive to the upper crust. Using the amplitudes of the massive receiver functions from permanent broadband seismic stations, the SWV structure of the upper crust is obtained in the coastal area of South China (CASC). A pattern of high to low SWVs is exhibited across the study area, with SWVs varying about 2.5–3.7 km/s from west to east. In the profile parallel to the coastline, lateral variations in the SWV correspond to the fault zone, indicating that the cutting depth of most coastal faults is approximately 10 km. Referring to previous studies, we deduce that the low SWV in most sub-areas can be interpreted as the joint effect of the sedimentary layer of the alluvial plain and the accumulation of underground heat flows, in addition to multistage fracturing tectonism. Moreover, the gradual change in the SWV in each profile from the surface to approximately 10 km is correlated with multiple invasions and the coverage of volcanic rocks, to a certain extent

    Study on Detection and Classification of Tetracycline Residue in Duck Meat Using Synchronous Fluorescence Spectra and Support Vector Machine

    To rapidly detect whether tetracycline residues in duck meat are excessive, the optimum characteristic wavelength difference Δλ was determined by the synchronous fluorescence method, and a recognition model for different tetracycline residue levels was established using a support vector machine classification algorithm. First, the optimum wavelength difference Δλ for duck meat samples was determined to be 70 nm, and synchronous fluorescence spectra of the samples were collected at Δλ = 70 nm. Second, the original synchronous fluorescence spectra were preprocessed with the standard normal variate (SNV) transformation. Third, 18 wavelength variables were selected from the 121 wavelength variables of the preprocessed spectra using competitive adaptive reweighted sampling (CARS). The radial basis function (RBF) was then selected as the kernel function of the support vector classification (SVC), and the optimal penalty factor C and kernel parameter g, obtained by grid search combined with 5-fold cross-validation, were 2.83 and 1, respectively. The SNV-CARS-SVC classification model was established, and its classification accuracy was 95.7% on the prediction set. The results showed that synchronous fluorescence analysis could identify different tetracycline residue levels quickly and accurately, providing a feasible method for assessing the quality of duck meat.
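    The SNV preprocessing step named in this abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `snv`, the sample intensities, and the use of the sample standard deviation (n - 1) are assumptions made for illustration. Each spectrum is centered on its own mean and scaled by its own standard deviation.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate: center one spectrum and scale it to unit deviation."""
    m = mean(spectrum)
    s = stdev(spectrum)  # assumption: sample standard deviation (n - 1)
    if s == 0:
        raise ValueError("a constant spectrum has no variation to scale")
    return [(x - m) / s for x in spectrum]


# Hypothetical fluorescence intensities at five wavelengths (not data from the study)
raw = [10.0, 12.0, 15.0, 12.0, 11.0]
corrected = snv(raw)
```

    After the transformation, every spectrum has zero mean and unit standard deviation, which reduces sample-to-sample offset and scaling differences before variable selection and classification.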
